    SOM-VAE: Interpretable Discrete Representation Learning on Time Series

    High-dimensional time series are common in many domains. Since human cognition is not optimized to work well in high-dimensional spaces, these areas could benefit from interpretable low-dimensional representations. However, most representation learning algorithms for time series data are difficult to interpret. This is due to non-intuitive mappings from data features to salient properties of the representation and non-smoothness over time. To address this problem, we propose a new representation learning framework building on ideas from interpretable discrete dimensionality reduction and deep generative modeling. This framework allows us to learn discrete representations of time series, which give rise to smooth and interpretable embeddings with superior clustering performance. We introduce a new way to overcome the non-differentiability in discrete representation learning and present a gradient-based version of the traditional self-organizing map algorithm that is more performant than the original. Furthermore, to allow for a probabilistic interpretation of our method, we integrate a Markov model in the representation space. This model uncovers the temporal transition structure, improves clustering performance even further and provides additional explanatory insights as well as a natural representation of uncertainty. We evaluate our model in terms of clustering performance and interpretability on static (Fashion-)MNIST data, a time series of linearly interpolated (Fashion-)MNIST images, a chaotic Lorenz attractor system with two macro states, as well as on a challenging real-world medical time series application on the eICU data set. Our learned representations compare favorably with competitor methods and facilitate downstream tasks on the real-world data. Comment: Accepted for publication at the Seventh International Conference on Learning Representations (ICLR 2019).
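The idea of a Markov model over discrete representations can be illustrated with a small sketch. The abstract does not say how the model is fitted, so the count-based estimate below is an assumption for illustration and not the authors' method; `transition_matrix` and the example sequence are hypothetical.

```python
from collections import Counter


def transition_matrix(states, n_states):
    """Row-normalised counts of transitions between consecutive discrete states."""
    counts = Counter(zip(states[:-1], states[1:]))
    matrix = []
    for i in range(n_states):
        row_total = sum(counts[(i, j)] for j in range(n_states))
        if row_total == 0:
            # No outgoing transition observed: fall back to a uniform row (assumption).
            matrix.append([1.0 / n_states] * n_states)
        else:
            matrix.append([counts[(i, j)] / row_total for j in range(n_states)])
    return matrix


# Hypothetical sequence of cluster assignments over time.
sequence = [0, 0, 1, 1, 1, 2, 0, 0, 1]
P = transition_matrix(sequence, n_states=3)
```

Each row of `P` sums to one and can be read as the estimated probability of moving from one discrete state to the next.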

    Newtype single-layer magnetic semiconductor in transition-metal dichalcogenides VX2 (X = S, Se and Te)

    We present a new type of 2-dimensional (2D) magnetic semiconductor based on transition-metal dichalcogenides VX2 (X = S, Se and Te) via first-principles calculations. The indirect band gaps of monolayer VS2, VSe2, and VTe2 obtained from the generalized gradient approximation (GGA) are 0.05, 0.22, and 0.20 eV, respectively, all with integer magnetic moments of 1.0 μB. The GGA plus on-site Coulomb interaction U (GGA + U) enhances the exchange splittings and raises the energy gap up to 0.38 to 0.65 eV. By adopting the GW approximation, we obtain converged G0W0 gaps of 1.3, 1.2, and 0.7 eV for VS2, VSe2, and VTe2 monolayers, respectively. They agree very well with our calculated HSE gaps of 1.1, 1.2, and 0.6 eV, respectively. The gap sizes as well as the metal-insulator transitions are tunable by applying in-plane strain and/or changing the number of stacking layers. The Monte Carlo simulations indicate very high Curie temperatures of 292, 472, and 553 K for VS2, VSe2, and VTe2 monolayers, respectively. These values are near or well above room temperature. Combining the semiconducting energy gap, the 100% spin-polarized valence and conduction bands, the room-temperature TC, and the in-plane magnetic anisotropy in a single layer of VX2, this new type of 2D magnetic semiconductor shows great potential in future spintronics.

    Machine Learning Approaches for Patient Monitoring in the Intensive Care Unit

    Patient monitoring in the ICU abounds with challenges that can be addressed using modern machine learning methods. One of the most pressing issues is how to distill features, i.e. vector representations, from the dynamically changing time-series; in other words, how to summarize the patient’s health state to solve relevant tasks, such as early prediction of organ failure in a data-driven way, departing from traditional ICU risk scores. Moreover, it is not clear how to proceed if labeled data is scarce or not available, and we wish to learn the patient’s health state evolution throughout the stay using purely unsupervised techniques. In the first part of the thesis, we construct a large-scale framework that includes patient-adaptive imputation, variable selection using SHAP values, feature construction to summarize physiological trends and patterns at multiple time-scales, and supervised machine learning methods to derive continuous ICU risk scores for critical organ failure. We apply our framework to the early prediction of circulatory and respiratory failure in retrospective data from the University Hospital Bern. For circulatory failure we show that the proposed alarm system circEWS-lite has an alarm precision of 40% while predicting 80% of failure events early. It drastically reduces the number of false alarms when compared to a decision tree baseline using MAP and lactate, as well as a traditional alarm system that generates alarms if individual variables are out of range. circEWS-lite was externally validated in the MIMIC-III cohort, resulting in close-to-identical performance as in the development cohort. For predicting respiratory failure up to 24h before the event, we propose respEWS-lite, which has an alarm precision of 40% while predicting 90% of critical events early. It outperforms two decision tree baselines using a small set of variables that would be traditionally considered in an ICU. 
In the second part of the thesis, we present a method that extracts morphological, spectral energy and cerebral auto-regulation descriptors from high-frequency intracranial waveforms, and computes features that describe historical trend and variation patterns of these metrics. We apply our methodology to the prediction of intracranial hypertension secondary to traumatic brain injury up to 8 hours in advance. Our results indicate that our proposed risk score outperforms several baselines from the literature on the MIMIC-III waveform database. We further show that the inclusion of features extracted from high-frequency waveforms increases performance significantly over minute-by-minute summaries, which have been used in the best-performing prediction model for this task known in the literature. In the last part of the thesis, we shift the focus to unsupervised representations derived without labeled data. Specifically, we (a) analyze architectures that inductively bias representations by reconstructing the future and output continuous health state representations, and (b) show how to learn discrete representations/clusterings, which are optimized for human interpretability and allow patient health state trajectories to be visualized on an easily understandable 2D map. Our results indicate that Seq2Seq auto-encoding models that predict the future using an attention head outperform several baselines in forecasting future trajectories and solving downstream tasks using limited labeled data. Further, the proposed model for discrete clustering of patient health states, T-DPSOM, outperforms several baselines in clustering and forecasting trajectories, while improving upon popular visualization methods like t-SNE, providing a better spatial coherence of the clusterings in terms of clinically tangible concepts like a dynamic variant of the APACHE score.
Overall, we proposed a family of supervised and unsupervised techniques that could alert clinicians in the ICU to critical organ failure and help them better understand the patient’s health state trajectory throughout the stay with innovative visualization techniques built on top of clustering and representation learning. Future work will focus on improving the performance of the proposed risk scores by integrating additional data modalities like images and text data. Further, a remaining challenge is to find principled approaches to learn such risk scores on data pooled from several hospitals, taking into account distribution shift between different cohorts and centers. In this future paradigm, data-driven risk scores can be iteratively adapted to new locations and contexts, minimizing manual data harmonization and the need to pool data at a central location. We hypothesize that this requires bespoke neural network architectures based on emergent machine learning paradigms like meta-learning or continual learning.
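The two figures quoted for circEWS-lite and respEWS-lite, alarm precision and the fraction of events predicted early, can be made concrete with a small worked example. The definitions below are assumptions for illustration (an alarm is correct if an event follows within a fixed horizon; an event is caught if an alarm precedes it within that horizon), and the thesis may define them differently. The function name and data are hypothetical.

```python
def alarm_metrics(alarm_times, event_times, horizon):
    """Return (alarm precision, event recall) under the assumed definitions.

    An alarm is counted as correct if some event starts within `horizon`
    time units after it; an event is counted as caught if some alarm was
    raised within `horizon` time units before it.
    """
    def linked(alarm, event):
        return 0 < event - alarm <= horizon

    true_alarms = sum(any(linked(a, e) for e in event_times) for a in alarm_times)
    caught_events = sum(any(linked(a, e) for a in alarm_times) for e in event_times)
    precision = true_alarms / len(alarm_times) if alarm_times else 0.0
    recall = caught_events / len(event_times) if event_times else 0.0
    return precision, recall


# Hypothetical alarm and event onset times (hours since admission).
alarms = [1, 5, 20, 30, 50]
events = [8, 35, 90]
precision, recall = alarm_metrics(alarms, events, horizon=8)
```

Here three of the five alarms are followed by an event within 8 hours, and two of the three events are preceded by an alarm, so precision is 3/5 and recall is 2/3.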

    PriMa: A low-cost, modular, open hardware, and 3D-printed fMRI manipulandum

    Motor actions in fMRI settings require specialized hardware to monitor, record, and control the subject's behavior. Commercially available options for such behavior tracking or control are very restricted and costly. We present a novel grasp manipulandum in a modular design, consisting of MRI-compatible, 3D printable buttons and a chassis for mounting. Button presses are detected by the interruption of an optical fiber path, which is digitized by a photodiode and subsequent signal amplification and thresholding. Two feedback devices (manipulanda) are constructed, one for macaques (Macaca mulatta) and one for human use. Both devices have been tested in their specific experimental setting and possible improvements are reported. Design files are shared under an open hardware license.
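The thresholding step can be sketched in software as follows. This is only an illustration under stated assumptions: the abstract does not say whether thresholding happens in hardware or software, the signal polarity (light blocked gives a lower reading) is assumed, and `detect_presses`, the sample values, and the threshold are hypothetical.

```python
def detect_presses(samples, threshold):
    """Return the sample indices at which the light level first drops below threshold."""
    presses = []
    pressed = False
    for index, level in enumerate(samples):
        below = level < threshold  # assumed polarity: blocked fiber -> low reading
        if below and not pressed:
            presses.append(index)  # record only the onset of each press
        pressed = below
    return presses


# Hypothetical normalised photodiode readings containing two button presses.
signal = [0.9, 0.92, 0.1, 0.08, 0.91, 0.9, 0.05, 0.9]
onsets = detect_presses(signal, threshold=0.5)
```

A real signal would likely need debouncing or hysteresis to avoid repeated triggers near the threshold; that is left out here for brevity.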

    Neighborhood Contrastive Learning Applied to Online Patient Monitoring

    Intensive care units (ICU) are increasingly looking towards machine learning for methods to provide online monitoring of critically ill patients. In machine learning, online monitoring is often formulated as a supervised learning problem. Recently, contrastive learning approaches have demonstrated promising improvements over competitive supervised benchmarks. These methods rely on well-understood data augmentation techniques developed for image data which do not apply to online monitoring. In this work, we overcome this limitation by supplementing time-series data augmentation techniques with a novel contrastive learning objective which we call neighborhood contrastive learning (NCL). Our objective explicitly groups together contiguous time segments from each patient while maintaining state-specific information. Our experiments demonstrate a marked improvement over existing work applying contrastive methods to medical time-series. ISSN:2640-349
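One way to read "groups together contiguous time segments from each patient" is as a rule for which pairs of samples count as neighbors. The sketch below builds such a pairwise mask under an assumed rule (same patient, timestamps within a fixed window); the abstract does not give the actual definition or the loss, so `neighborhood_mask`, `window`, and the data are hypothetical.

```python
def neighborhood_mask(patient_ids, timestamps, window):
    """mask[i][j] is True when samples i and j are distinct, belong to the
    same patient, and lie within `window` time units of each other."""
    n = len(patient_ids)
    return [
        [
            i != j
            and patient_ids[i] == patient_ids[j]
            and abs(timestamps[i] - timestamps[j]) <= window
            for j in range(n)
        ]
        for i in range(n)
    ]


# Hypothetical batch: three segments from patient "a", one from patient "b".
patients = ["a", "a", "a", "b"]
times = [0, 1, 10, 1]
mask = neighborhood_mask(patients, times, window=2)
```

In a contrastive objective, pairs marked `True` would typically be pulled together in representation space, while the remaining pairs act as negatives.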
